Comment API Design Evaluation and Latency Budget

Learn how we meet the non-functional requirements through our proposed design of the comment API.

Introduction#

In the preceding lessons, we were able to meet the functional requirements that we set earlier for the comment service API. This lesson focuses on the non-functional requirements and how we meet them. Furthermore, we’ll also explain some tradeoffs that occur while meeting different non-functional requirements.

Non-functional requirements#

Let's discuss how we fulfill the non-functional requirements of the API for the commenting service.

Scalability#

Our API should be stateless to handle a large number of concurrent requests. Thankfully, HTTP gives us this ability. We don't want a stateful API because it would have to maintain per-user session data on the server, which becomes a bottleneck to scalability.

Furthermore, we consider a relational database the right choice because the comments data is structured; that is, a comment has predefined attributes that can be stored in tabular format. Relational databases can also be scaled horizontally when required (for example, through read replicas and sharding), providing adequate performance in most cases.

The processing of comments is performed asynchronously to rapidly and easily mitigate long queues of operations (requests) heading toward the back-end servers, as shown in the following figure.

Asynchronous versus synchronous operations on comments

In the asynchronous approach, the client-initiated request is validated and authenticated by the API gateway. In the next step, the request is forwarded to the back-end servers, and at the same time, the client is acknowledged. During the execution of the request, the client starts processing other tasks. The client is notified if any error occurs during the processing of the request on the server side.

In the synchronous approach, after the request is successfully validated, it's forwarded to the back-end servers. The back-end server processes the request while the client waits, blocked, until the response is sent back.
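The asynchronous flow above can be sketched in a few lines of Python. This is an illustrative, in-process sketch, not the actual gateway code: names like gateway_handle are our own, and a real system would use a message broker rather than an in-memory queue.

```python
import queue
import threading

request_queue = queue.Queue()
results = {}

def gateway_handle(request_id, comment_text):
    """Validate, enqueue, and acknowledge without waiting for processing."""
    if not comment_text:
        return {"status": 400, "error": "empty comment"}
    request_queue.put((request_id, comment_text))
    return {"status": 202, "message": "accepted"}  # acknowledged immediately

def worker():
    """Back-end worker: drains the queue and stores processed comments."""
    while True:
        item = request_queue.get()
        if item is None:          # sentinel to stop the worker
            break
        request_id, text = item
        results[request_id] = {"comment": text, "state": "stored"}
        request_queue.task_done()

threading.Thread(target=worker, daemon=True).start()

ack = gateway_handle("req-1", "Great post!")
print(ack["status"])              # 202: client is free to do other work
request_queue.join()              # wait for background processing (demo only)
print(results["req-1"]["state"])  # stored
```

The client receives the 202 acknowledgment before the comment is actually stored; any later failure would be reported through a separate notification, as described above.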

Point to Ponder

Question

What happens if the same user issues conflicting (concurrent) commands for the same operation—for example, deleting a comment from their computer and hand-held device?


Conflicting requests are managed via idempotent HTTP methods. For example, deleting a comment is performed via the DELETE method, which is idempotent by default. Similarly, editing a comment via the PUT method is also idempotent if the parameters are provided appropriately. Non-idempotent operations are a different story: creating a comment via the POST method may produce duplicate comments on a post if the request is repeated.

For concurrent operations, we have a concurrency controlling mechanism at the backend that deals with concurrent operations and avoids race conditions.
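The idempotency argument can be illustrated with a small Python sketch. The DELETE behavior follows directly from the answer above; the idempotency-key mitigation for POST is a common industry pattern that we assume here, not something specified in the lesson.

```python
comments = {"c1": "First!"}
seen_post_keys = {}   # client-supplied idempotency key -> comment ID

def delete_comment(comment_id):
    """DELETE is idempotent: repeating it leaves the same end state."""
    comments.pop(comment_id, None)
    return 204

def create_comment(text, idempotency_key):
    """POST made safe to retry by deduplicating on an idempotency key."""
    if idempotency_key in seen_post_keys:        # repeated command
        return seen_post_keys[idempotency_key]   # return the original ID
    new_id = f"c{len(seen_post_keys) + 2}"
    comments[new_id] = text
    seen_post_keys[idempotency_key] = new_id
    return new_id

delete_comment("c1")      # issued from the computer
delete_comment("c1")      # same command from the phone: no error, same state
a = create_comment("Nice!", "key-123")
b = create_comment("Nice!", "key-123")   # retried request
print(a == b)             # True: no duplicate comment is created
```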

Availability#

The API gateway acts as a bridge between a client and the back-end system. So, when the API gateway fails, the back-end system will be unable to receive and process requests. Therefore, the API gateway’s availability is crucial.

We achieve high availability of our APIs by applying rate limiting, monitoring, and automatic recovery approaches. Rate limiting helps to allocate the request quota evenly among users. For example, a user can post only a certain number of comments per unit of time. Additionally, proper API monitoring and alerting mechanisms help to analyze incoming and outgoing traffic toward our API. Such mechanisms produce statistics that help to detect any anomalous activity that could degrade or halt our system.

Similarly, when an incident occurs, the time it takes to recover can have a huge impact on our system’s availability. It is crucial to have an automatic recovery system to diagnose the problems and recover as soon as possible.
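One common way to realize the per-user quota mentioned above is a token bucket. The sketch below is a minimal single-process illustration of the idea (the capacity and refill rate are assumed values, not figures from the design):

```python
import time

class TokenBucket:
    """Minimal token-bucket limiter, e.g., 5 comments per user per minute."""

    def __init__(self, capacity, refill_per_sec):
        self.capacity = capacity
        self.tokens = float(capacity)
        self.refill_per_sec = refill_per_sec
        self.last = time.monotonic()

    def allow(self):
        now = time.monotonic()
        # Refill proportionally to elapsed time, capped at capacity.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.refill_per_sec)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False  # over quota: the gateway would answer 429 Too Many Requests

limiter = TokenBucket(capacity=5, refill_per_sec=5 / 60)  # 5 comments/minute
decisions = [limiter.allow() for _ in range(6)]
print(decisions)  # first five requests allowed, sixth rejected
```

In production, the gateway would keep one bucket per user (or per API key) in a shared store so that all gateway instances see the same quota.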

Security#

Our API deals with various operations on comments. Public comments can be accessed by providing only the API key to the gateway. However, to create, edit, delete, or flag a comment, the user must log in, obtaining an access token via the OAuth mechanism and a user ID via OpenID Connect to authenticate their identity. The API gateway validates the access token and user ID and allows users to perform the relevant operations.
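The gateway's decision logic can be sketched as follows. This is a hypothetical Python sketch: the in-memory key and token stores stand in for real validation against the OAuth provider and OpenID Connect, and the names are our own.

```python
VALID_API_KEYS = {"demo-key"}
VALID_TOKENS = {"token-abc": "user-42"}   # access token -> user ID

def authorize(operation, api_key, access_token=None):
    """Gateway check: API key for public reads, access token for writes."""
    if api_key not in VALID_API_KEYS:
        return (401, "invalid API key")
    if operation == "read":               # public comments: API key is enough
        return (200, "ok")
    # Create, edit, delete, and flag require an authenticated user.
    user_id = VALID_TOKENS.get(access_token)
    if user_id is None:
        return (401, "login required")
    return (200, user_id)

print(authorize("read", "demo-key"))                 # (200, 'ok')
print(authorize("delete", "demo-key"))               # (401, 'login required')
print(authorize("delete", "demo-key", "token-abc"))  # (200, 'user-42')
```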

Point to Ponder

Question 2

If a comment system is provided by the same platform where we’ll post comments, which OAuth 2.0 flow should we follow?


When the comment service is provided by the platform, there is no need to follow an OAuth 2.0 flow because in such cases, we can use the authentication and authorization system provided by the platform.


Reliability#
  • Eliminate single points of failure: To improve the reliability and availability of our API, we first need to eliminate single points of failure (SPOF). For example, all requests go through the API gateway, so we need to deploy a set of redundant servers to handle them.
  • Circuit breaker: Another measure is to mitigate the risk of cascading failures by utilizing circuit breakers.

“Everything fails all the time. Plan on your applications and services failing. It will happen. Now, deal with it.”

—Werner Vogels, Amazon CTO
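The circuit-breaker idea above can be sketched minimally: after a run of consecutive failures, the breaker "opens" and callers fail fast instead of piling load onto a struggling downstream service. This is an illustrative sketch, not the design's actual implementation; real breakers also add a half-open probing state after a cooldown, omitted here for brevity.

```python
class CircuitBreaker:
    """Opens after `threshold` consecutive failures; then calls fail fast."""

    def __init__(self, threshold=3):
        self.threshold = threshold
        self.failures = 0
        self.open = False

    def call(self, fn):
        if self.open:
            raise RuntimeError("circuit open: failing fast")
        try:
            result = fn()
            self.failures = 0     # a success resets the failure count
            return result
        except Exception:
            self.failures += 1
            if self.failures >= self.threshold:
                self.open = True  # stop sending traffic downstream
            raise

breaker = CircuitBreaker(threshold=3)

def flaky():
    raise ConnectionError("comment store unreachable")

for _ in range(3):                # three straight failures trip the breaker
    try:
        breaker.call(flaky)
    except Exception:
        pass
print(breaker.open)               # True: subsequent calls fail fast
```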

Low latency#

We can minimize the API’s latency at various levels through the following:

  • While the TLS handshake is time-consuming, techniques like TLS False Start greatly reduce the latency of setting up a secure channel between the client and server.
  • It’s important to recognize highly rated comments (say, top 10) so that they can be cached and presented through the API gateway by employing high-speed caches. Moreover, we can also cache comments on viral posts to reduce latency.
  • By using geographically distributed back-end servers and the cache associated with them, we bring the service closer to users.
  • We perform the operations on comments asynchronously, which also reduces the client’s perceived latency.
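The gateway-side caching of top comments can be illustrated with a tiny TTL cache. This is a hypothetical sketch: fetch_top_comments stands in for the expensive back-end query, and the 30-second TTL is an assumed freshness window.

```python
import time

_cache = {"value": None, "expires": 0.0}
TTL_SECONDS = 30            # assumed freshness window for hot comments

def fetch_top_comments():
    """Placeholder for the expensive back-end/database query."""
    return ["comment-%d" % i for i in range(10)]

def get_top_comments():
    now = time.monotonic()
    if _cache["value"] is None or now >= _cache["expires"]:
        _cache["value"] = fetch_top_comments()   # cache miss: hit the backend
        _cache["expires"] = now + TTL_SECONDS
    return _cache["value"]                       # cache hit: no backend trip

first = get_top_comments()    # miss: fills the cache
second = get_top_comments()   # hit: served from gateway memory
print(first is second)        # True
```

Within the TTL, repeated requests for the top comments never touch the back-end servers, which is what removes the corresponding latency.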

Note: Even if we reduce latency through various mechanisms, tradeoffs remain between latency and security, and we must choose the lesser evil; between the two, we opt for security. A similar tradeoff exists between consistency and availability, and for the present design problem, we opt for availability because it isn't important for every user to see the same list of comments in the same order (see the CAP theorem for details).

Achieving Non-Functional Requirements

Scalability
  • Making the API stateless
  • Using relational databases due to the nature of structured data
  • Asynchronous processing of the operations on comments

Availability
  • Applying rate limiting, monitoring, and automatic recovery approaches

Security
  • Adopting the OAuth and OpenID Connect mechanisms

Reliability
  • Eliminating SPOF
  • Using circuit breakers to prevent cascading failures

Low latency
  • Using the TLS False Start mechanism
  • Using high-speed cache in the API gateway
  • Geographically distributed back-end servers
  • Asynchronous processing of the operations on comments

Latency budget#

Let's discuss the response time of the comment service API. To estimate it, we consider the requests to retrieve and post a comment. We consider these messages in our estimation because they are relatively large, containing multiple data items. Before calculating the response time, we need to estimate the message size, latency, and processing time.

Note: As discussed in the Back-of-the-Envelope Calculations for Latency chapter, the latencies of GET and POST requests are affected by different parameters. For GET, the average RTT stays the same regardless of the data size (because the request is small), while the time to download the response varies by 0.4 ms per KB. For POST, the RTT grows with the data size by 1.15 ms per KB on top of the base RTT of 260 ms.

Message size#

Let's assume that the maximum number of characters allowed in a comment is 10,000. Then, the GET and POST requests will have the following message sizes:

  • GET request size: In the GET request, we limit ourselves to retrieving the top 10 comments because we tend to retrieve the most viral comments in one request. The maximum size of a GET request varies depending on industry practices; however, GET requests are usually limited to a size of 2 KB, including various attributes, headers, the post ID, user ID, API key, page information, and so on.

  • GET response size: The maximum size of the GET response, including headers, will be roughly 100 KB because of the following equation:

Size_{comment} \times Number_{comments} = 10{,}000\ bytes \times 10 \approx 100\ KB
  • POST request size: Assuming the maximum size of the comment, each POST request would generally not exceed 10 KB. This request consists of various attributes, including the comment text, user ID, post ID, timestamp, and so on.
  • POST response size: Apart from the status code, phrase, and headers, the response to the POST request would consist of the comment ID, timestamp, comment link, and other such attributes. However, the response size is generally limited to 1 KB.

In the four message sizes above, only the GET response and POST request have considerable sizes. Therefore, we will consider only those two as shown below. Furthermore, the sizes of the GET request and POST response messages are already included in RTT_{get} and RTT_{post}, respectively, as discussed earlier in the "Back-of-the-Envelope Calculations for Latency" chapter.

Note:

GET\ response\ size = 100\ KB

POST\ request\ size = 10\ KB

Response time#

We will now compute the latency and response times of both the GET and POST requests according to the formulation presented in the “Back-of-the-Envelope Calculations for Latency” chapter. The following calculator computes the minimum and maximum response times for the GET method:

Response Time Calculator to List Comments

For a response size of 100 KB: minimum latency = 230.5 ms, maximum latency = 311.5 ms, minimum response time = 234.5 ms, maximum response time = 315.5 ms.

Assuming the response size is 100 KB, the latency is calculated by:

Time_{latency\_min} = Time_{base\_min} + RTT_{get} + 0.4 \times response\ size\ (KB) = 120.5 + 70 + 0.4 \times 100 = 230.5\ ms

Time_{latency\_max} = Time_{base\_max} + RTT_{get} + 0.4 \times response\ size\ (KB) = 201.5 + 70 + 0.4 \times 100 = 311.5\ ms

Similarly, the response time is calculated using the following equation:

Time_{Response} = Time_{latency} + Time_{processing}

Now, for the minimum response time, we use the minimum values of latency and processing time:

Time_{Response\_min} = Time_{latency\_min} + Time_{processing\_min} = 230.5\ ms + 4\ ms = 234.5\ ms

Similarly, for the maximum response time, we use the maximum values:

Time_{Response\_max} = Time_{latency\_max} + Time_{processing\_max} = 311.5\ ms + 4\ ms = 315.5\ ms

Similarly, we can compute the minimum and maximum response times with the help of the following calculator.

Response Time Calculator to Create a Comment

For a request size of 10 KB: minimum latency = 392.4 ms, maximum latency = 473.4 ms, minimum response time = 396.4 ms, maximum response time = 477.4 ms.

Assuming the request size is 10 KB:

Time_{latency} = Time_{base} + RTT_{post} + Download

RTT_{post} = RTT_{base} + 1.15 \times Size = 260\ ms + 1.15\ ms \times 10\ KB

The trailing 0.4 ms term below is the download time of the roughly 1 KB POST response:

Time_{latency\_min} = Time_{base\_min} + (RTT_{base} + 1.15 \times request\ size\ (KB)) + 0.4 = 120.5 + (260 + 1.15 \times 10) + 0.4 = 392.4\ ms

Time_{latency\_max} = Time_{base\_max} + (RTT_{base} + 1.15 \times request\ size\ (KB)) + 0.4 = 201.5 + (260 + 1.15 \times 10) + 0.4 = 473.4\ ms

Similarly, the response time is calculated as:

Time_{Response\_min} = Time_{latency\_min} + Time_{processing\_min} = 392.4\ ms + 4\ ms = 396.4\ ms

Time_{Response\_max} = Time_{latency\_max} + Time_{processing\_max} = 473.4\ ms + 4\ ms = 477.4\ ms
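The whole latency budget can be packaged into two small functions that reproduce the figures derived above. The constants come from the "Back-of-the-Envelope Calculations for Latency" chapter as cited in this lesson; the function names are our own.

```python
BASE_MIN, BASE_MAX = 120.5, 201.5   # ms, minimum/maximum base times
RTT_GET = 70.0                      # ms, average GET round-trip time
RTT_POST_BASE = 260.0               # ms, base POST round-trip time
DOWNLOAD_PER_KB = 0.4               # ms per KB of response
UPLOAD_PER_KB = 1.15                # ms per KB of POST payload
PROCESSING = 4.0                    # ms, server-side processing time

def get_response_time(response_kb):
    """(min, max) response time in ms for a GET of `response_kb` KB."""
    lat_min = BASE_MIN + RTT_GET + DOWNLOAD_PER_KB * response_kb
    lat_max = BASE_MAX + RTT_GET + DOWNLOAD_PER_KB * response_kb
    return lat_min + PROCESSING, lat_max + PROCESSING

def post_response_time(request_kb, response_kb=1):
    """(min, max) response time in ms for a POST of `request_kb` KB."""
    rtt = RTT_POST_BASE + UPLOAD_PER_KB * request_kb
    lat_min = BASE_MIN + rtt + DOWNLOAD_PER_KB * response_kb
    lat_max = BASE_MAX + rtt + DOWNLOAD_PER_KB * response_kb
    return lat_min + PROCESSING, lat_max + PROCESSING

print(tuple(round(v, 1) for v in get_response_time(100)))   # (234.5, 315.5)
print(tuple(round(v, 1) for v in post_response_time(10)))   # (396.4, 477.4)
```

Plugging in the 100 KB GET response and 10 KB POST request yields the same 234.5-315.5 ms and 396.4-477.4 ms ranges computed step by step above.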

Latency and processing time of the GET and POST requests of the comment API

We have seen that the maximum response times for the GET and POST messages are 315.5 ms and 477.4 ms, respectively. Since an API is generally considered effective if it responds within one second, we can deem the response time of the comment service satisfactory.

Point to Ponder

Question

Why does the processing time remain the same (4 ms) for fetching one comment or a list of comments?


There are a couple of reasons why the processing time remains the same for fetching one comment or a list of comments.

  • The back-end servers are powerful enough to execute a query in approximately 1.5 ms, even if the query needs to alter 10,000 records.
  • The networks inside the data center are much faster compared to user-facing networks/Internet.

In this lesson, we described how we met the non-functional requirements we promised in the Requirements of the Comment API lesson. We also estimated the response time of our proposed API for the comment service by taking an example of retrieving the top 10 comments at a time.
